Automatic Voice Selection in Japanese based on Various Linguistic Information
Abstract
This paper focuses on voice selection, a subtask of natural language generation (NLG) that decides whether a clause is realised in the active or passive voice according to its contextual information. Automatic voice selection is essential for realising more sophisticated MT and summarisation systems, because it affects the readability of generated texts. However, to the best of our knowledge, the NLG community has paid little attention to explicit voice selection. In this paper, we propose an automatic voice selection model based on various linguistic information, ranging from lexical to discourse information. Our empirical evaluation using a manually annotated Japanese corpus demonstrates that the proposed model achieves an F-score of 0.758, outperforming the two baseline models.
Similar resources
Listening between the lines: a study of paralinguistic information carried by tone-of-voice
This paper describes a study of speaking-style characteristics, or “tone-of-voice”, in conversational speech, and shows that non-verbal information is transmitted efficiently regardless of cultural and linguistic contexts through differences in prosodic and voice-quality features. We asked Korean and American listeners with no previous knowledge of Japanese to judge the meaning of various uttera...
Japanese Hyponymy Extraction based on a Term Similarity Graph
Semantic relations between words, such as hyponymy, synonymy and meronymy, have various information access applications (e.g. Web search) and the automatic extraction of such relations from corpora is an important research problem in natural language processing. For the Japanese language, there exist several linguistic resources that contain these relations, such as the Japanese Wordnet, Nihong...
Automatic Knowledge Acquisition for Case Alternation between the Passive and Active Voices in Japanese
We present a method for automatically acquiring knowledge for case alternation between the passive and active voices in Japanese. By leveraging several linguistic constraints on alternation patterns and lexical case frames obtained from a large Web corpus, our method aligns a case frame in the passive voice to a corresponding case frame in the active voice and finds an alignment between their c...
Analysis of Autocorrelation-based Parameters for Creaky Voice Detection
Creaky voice carries important linguistic and paralinguistic information. Parameters based on autocorrelation of the glottal excitation waveform are proposed for automatic detection of creaky voice in spontaneous speech. Analysis results show the ratio of the first two peaks of the autocorrelation function as a primary parameter to detect creaky voice.
Attention, Sobriety Checkpoint! Can Humans Determine by Means of Voice, if Someone is Drunk... and Can Automatic Classifiers Compete?
This paper analyzes the human performance of recognizing drunk speakers merely by voice and compares the results with the performance of an automatic statistical classifier. The study is carried out within the Interspeech 2011 Speaker State Challenge [1] employing the Alcohol Language Corpus (ALC) [2]. The 79 subjects yielded an average performance of 55.8% unweighted accuracy on a balanced int...